The Annals of Statistics 

2006, Vol. 34, No. 4, 1827-1849 

DOI: 10.1214/009053606000000425 

© Institute of Mathematical Statistics, 2006 

ON THE BENJAMINI-HOCHBERG METHOD

By J. A. Ferreira and A. H. Zwinderman
University of Amsterdam 

We investigate the properties of the Benjamini-Hochberg method 
for multiple testing and of a variant of Storey's generalization of 
it, extending and complementing the asymptotic and exact results 
available in the literature. Results are obtained under two different 
sets of assumptions and include asymptotic and exact expressions and 
bounds for the proportion of rejections, the proportion of incorrect 
rejections out of all rejections and two other proportions used to 
quantify the efficacy of the method. 

1. Introduction. Let $\mathcal{X} = \{X_1, X_2, \ldots, X_m\}$ be a set of $m$ random
variables defined on a probability space $(\Omega, \mathcal{F}, P)$ such that, for some
positive integer $m_0 \le m$, each of $X_1, X_2, \ldots, X_{m_0}$ has distribution function
(d.f.) $F$ and $X_{m_0+1}, \ldots, X_m$ all have d.f.'s different from $F$, and consider the
problem of choosing a set $\mathcal{R} \subseteq \mathcal{X}$ in such a way that the random
variable (r.v.)

$$\Pi_{1,m} = \frac{S_m}{R_m \vee 1},$$

where $R_m = \#\mathcal{R}$ and $S_m = \#(\mathcal{R} \cap \{X_1, \ldots, X_{m_0}\})$, is
guaranteed to be small in some probabilistic sense. In more ordinary language, the problem
is that of discovering observations in $\mathcal{X}$ which do not have d.f. $F$ without
incurring a high proportion of incorrect rejections, that is, a high proportion $\Pi_{1,m}$
of rejected observations which in fact come from $F$.

Benjamini and Hochberg [2] have proposed a method of choosing $\mathcal{R}$ specifically
aimed at discovering r.v.'s taking values in the interval $[0, 1]$ that tend to be smaller
than standard uniform r.v.'s and which, given $\delta > 0$, guarantees that
$E(\Pi_{1,m}) \le \delta$ under certain conditions. The method consists of fixing
$q \in [0, 1]$, computing

$$R_m = \max\left\{ i : X_{i:m} \le q\,\frac{i}{m} \right\}, \tag{1.1}$$

where $0 \le X_{1:m} \le \cdots \le X_{m:m} \le 1$ denote the order statistics of
$\mathcal{X}$, and setting $\mathcal{R} = \{X_{1:m}, \ldots, X_{R_m:m}\}$. In its simplest
form, the Benjamini-Hochberg theorem states that if $\mathcal{R}$ is chosen according to
this procedure and $X_1, X_2, \ldots, X_{m_0}$ are independent and standard uniform and
independent of $X_{m_0+1}, \ldots, X_m$, then $E(\Pi_{1,m}) = q\gamma$, where
$\gamma := m_0/m$, a property usually expressed by saying that the Benjamini-Hochberg
procedure controls the false discovery rate, the number $E(\Pi_{1,m})$.

The Benjamini-Hochberg procedure seems somewhat mysterious from 
(1.1) alone; an explanation as to why it does work in the appropriate cir- 
cumstances will be given below. 
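The step-up rule (1.1) can be sketched in a few lines of Python. This is only a minimal illustration: the function names `bh_rejection_count` and `bh_rejected` are hypothetical and do not come from the paper, and the convention that $R_m = 0$ when no index qualifies is an assumption consistent with setting $X_{0:m} = 0$.

```python
def bh_rejection_count(pvalues, q):
    """Return R_m = max{i : X_{i:m} <= q*i/m}, or 0 if no index qualifies."""
    m = len(pvalues)
    ordered = sorted(pvalues)  # order statistics X_{1:m} <= ... <= X_{m:m}
    r_m = 0
    for i, x in enumerate(ordered, start=1):
        if x <= q * i / m:
            r_m = i  # keep the largest index that passes its own threshold
    return r_m


def bh_rejected(pvalues, q):
    """Return the rejected set: the R_m smallest observations."""
    return sorted(pvalues)[:bh_rejection_count(pvalues, q)]
```

Note that the rule is a "step-up" rule: with `pvalues = [0.04, 0.05, 0.9]` and `q = 0.1`, the smallest value fails its own threshold `q/3`, yet it is rejected because the second smallest passes `2q/3`.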

Benjamini and Hochberg [2] formulated their ideas in the context of multiple testing.
Here, rejecting observations in $\mathcal{X}$ is interpreted as rejecting hypotheses among
$m$ null hypotheses $H_0^1, \ldots, H_0^m$, of which only the first $m_0$ are true, on the
basis of p-values $X_1, \ldots, X_m$ that result from the observation of certain test
statistics $Y_1, \ldots, Y_m$. Although the hypotheses tested may be arbitrary, the
p-values are assumed to be given by $X_i = 1 - F_i(Y_i)$, where $F_i$ is the d.f. of $Y_i$
under $H_0^i$; furthermore, in the most general case considered by Sarkar [15]
$X_1, X_2, \ldots, X_{m_0}$ need not be independent and are only assumed to be sub-uniform
in the sense that $P(X_i \le x) \le x$ for all $x \in [0, 1]$. [Note: In general,
$P(X_i \le x) \ge x$, rather than $P(X_i \le x) \le x$: If $F$ is a d.f. and
$F^{-1}(u) = \min\{t : F(t) \ge u\}$ then $F(t) \ge u \Leftrightarrow t \ge F^{-1}(u)$,
and $F(F^{-1}(u)-) \le u$; therefore,
$P(X_i \le x) = P(F_i(Y_i) \ge 1 - x) = P(Y_i \ge F_i^{-1}(1 - x))
= 1 - F_i(F_i^{-1}(1 - x)-) \ge x$, with equality for all $x$ if and only if $F_i$ is
continuous. Thus (see, e.g., the proof of Theorem 2.1), under the assumptions usually made
in the literature, the Benjamini-Hochberg theorem actually states that
$E(\Pi_{1,m}) \ge q\gamma$. If the method is modified by using strict inequality in (1.1)
and the p-values are defined by $X_i = F_i(Y_i)$ (which represents no loss of generality),
then $E(\Pi_{1,m}) \le q\gamma$ with equality if $Y_1, \ldots, Y_{m_0}$ are continuous,
because $P(X_i < x) = P(F_i(Y_i) < x) = P(Y_i < F_i^{-1}(x)) = F_i(F_i^{-1}(x)-) \le x$.]

Most common multiple testing procedures tend to be either too conservative or too
liberal: they either miss the chance of detecting many false hypotheses in the fear of
incorrectly rejecting one hypothesis (the case of the Bonferroni method), or they incur a
very large proportion of false positives in the greed of finding significant results (the
case of "uncritical testing," in which all hypotheses yielding p-values below $q$, say,
are rejected). Benjamini and Hochberg's [2] motivation in proposing to control the false
discovery rate was to achieve a balance between these two extremes: in many problems,
especially those involving many hypotheses, it is acceptable to incorrectly reject some
hypotheses as long as they make up only a small proportion of all the hypotheses rejected;
and allowing for this proportion of false positives yields a substantial proportion of
true discoveries. We were led to the Benjamini-Hochberg approach to multiple testing by
considering one such problem: "gene discovery" in the context of heart disease, where the
objective is to discover genetic variables which determine or influence a number of
phenotypical variables. "Gene expression" studies provide other examples of problems where
the control of the false discovery rate is important; see, for example, Tusher, Tibshirani
and Chu [22], Dudoit, Schaffer and Boldrick [7], Reiner, Yekutieli and Benjamini [14], Fan
et al. [8] and McLachlan, Do and Ambroise [12]. Some of these authors actually use
variants of the Benjamini-Hochberg method based on estimating the proportion of incorrect
rejections out of all rejections that result from rejecting all p-values below $t$ as a
function of $t$, a procedure which for $t = qR_m/m$ is equivalent to Benjamini and
Hochberg's.

As outlined in our first paragraph, the problem of choosing $\mathcal{R}$ in a way that
controls $\Pi_{1,m}$ seems to arise in other contexts as well. For instance, in data
analyses of "contaminated" data, where a majority of elements form a sample from some
population but a minority do not, $\mathcal{R}$ records those observations thought to be
"outliers," and it is naturally of interest to seek a choice of $\mathcal{R}$ that keeps
$\Pi_{1,m}$ small so that not too many of the good observations are thrown away. In the
more general formulation, the variables $X_{m_0+1}, \ldots, X_m$ need not behave in a more
extreme way than $X_1, \ldots, X_{m_0}$; they simply have d.f.'s that differ from $F$, and
the problem, then, can be further translated into that of identifying a mixture of two
populations given the knowledge of the law describing one of them. This is a useful point
of view in that it helps us to put the Benjamini-Hochberg method into a context of
goodness of fit, which is not just more general but also illuminating as far as the
workings and the limitations of the method are concerned. More specifically, the problem
could, in principle, be solved by choosing $\mathcal{R}$ as the subset of $\mathcal{X}$
for which a goodness of fit test of $F$ performed with $\mathcal{X} \setminus \mathcal{R}$
yields the smallest discrepancy among the discrepancies based on all subsets of
$\mathcal{X}$. As we shall see, what the Benjamini-Hochberg method does is just this,
except that the subsets considered are of the form $\{X_{1:m}, \ldots, X_{r:m}\}$ for
some $r$.

Let $H_m$ denote the empirical d.f. of $\mathcal{X}$; then (the second identity here is
known and has been used before in this context; e.g., see [1] and [9])

$$
\begin{aligned}
\{R_m \ge r\}
&= \bigcup_{k=r}^{m} \left\{ X_{k:m} \le q\,\frac{k}{m} \right\}
 = \bigcup_{k=r}^{m} \left\{ H_m\!\left(q\,\frac{k}{m}\right) \ge \frac{k}{m} \right\} \\
&= \bigcup_{k=r}^{m} \left\{ \frac{H_m(qk/m) - qk/m}{qk/m} \ge \frac{1-q}{q} \right\} \\
&= \left\{ \max_{t = qr/m, \ldots, q(m-1)/m,\, q} \frac{H_m(t) - t}{t} \ge \frac{1-q}{q} \right\},
\end{aligned}
\tag{1.2}
$$

$r = 0, 1, \ldots, m$, so the procedure rejects the $r$ lower order statistics if and
only if

$$\max_{t = qr/m, \ldots, q(m-1)/m,\, q} \frac{H_m(t) - t}{t} \ge \frac{1-q}{q}$$

and

$$\max_{t = q(r+1)/m, \ldots, q(m-1)/m,\, q} \frac{H_m(t) - t}{t} < \frac{1-q}{q}.$$

In other words, the $r$ lower order statistics are rejected whenever the goodness of fit
statistics

$$
\max_{t = qk/m, \ldots, q(m-1)/m,\, q} \frac{H_m(t) - t}{t}
\approx \max_{t \in [qk/m,\, q]} \frac{H_m(t) - t}{t},
\tag{1.3}
$$

$k = 1, \ldots, m$, indicate a relatively big discrepancy between $H_m$ and the uniform
d.f. over $[qr/m, q]$, and a relatively small one over $[q(r+1)/m, q]$, indicating that
most of the nonuniform observations lie in the interval $(0, qr/m]$; the standard for
comparison, $(1-q)/q$, corresponds to the biggest discrepancy of $(H_m(t) - t)/t$ one
could get at $t = q$, and the choice of $q$ determines the interval $(0, q]$ to be
"scanned" for discrepancies.
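The identity (1.2) can be checked numerically on small samples. The sketch below uses exact rational arithmetic (Python's `fractions.Fraction`) so that boundary cases such as $X_{k:m} = qk/m$ are not disturbed by rounding; the helper names are hypothetical, and the check is restricted to $r = 1, \ldots, m$ so that the grid never contains $t = 0$.

```python
from fractions import Fraction


def empirical_df(sample, t):
    """H_m(t): proportion of observations <= t, as an exact fraction."""
    return Fraction(sum(1 for x in sample if x <= t), len(sample))


def bh_count(sample, q):
    """R_m from (1.1), with R_m = 0 when no index qualifies."""
    m = len(sample)
    ordered = sorted(sample)
    return max((i for i in range(1, m + 1) if ordered[i - 1] <= q * i / m), default=0)


def gof_statistic(sample, q, r):
    """max of (H_m(t) - t)/t over the grid t = q*r/m, ..., q(m-1)/m, q."""
    m = len(sample)
    grid = [q * k / m for k in range(r, m + 1)]
    return max((empirical_df(sample, t) - t) / t for t in grid)


def identity_holds(sample, q):
    """Check {R_m >= r} == {statistic >= (1-q)/q} for r = 1, ..., m."""
    m = len(sample)
    r_m = bh_count(sample, q)
    bound = (1 - q) / q
    return all((r_m >= r) == (gof_statistic(sample, q, r) >= bound)
               for r in range(1, m + 1))
```

For instance, `identity_holds([Fraction(1, 100), Fraction(3, 100), Fraction(1, 2), Fraction(9, 10)], Fraction(1, 10))` compares both sides of (1.2) for every $r$ on a four-point sample.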

The function on the right-hand side in (1.3) is Rényi's statistic, a well-known
goodness of fit statistic for testing the uniform distribution; it is
a one-sided statistic of the Kolmogorov-Smirnov type, devised to detect 
distributions with too much mass in the lower tail, scaled by the standard 
uniform distribution in order to inflate the discrepancies that occur at lower 
values. 

From the version of the "ballot theorem" given on page 113 of [11], we know that if
$X_1, \ldots, X_m$ are independent standard uniform r.v.'s, then
$P(H_m(t) \le t/q\ \ \forall t \in (0, q]) = 1 - q$ for all $m \in \mathbb{N}$ and
$q \in [0, 1]$, from which it follows that the probability that the Benjamini-Hochberg
method yields no rejections satisfies
$P(R_m < 1) \approx 1 - P(\sup_{0 < t \le q} (H_m(t) - t)/t \ge (1-q)/q) = 1 - q$. Thus,
if the hypothesis that the variables are a standard uniform random sample is taken as the
null and the type I error is defined as the incorrect rejection of at least one p-value,
$q$ can be interpreted as the approximate significance level. (We thank a referee for
posing a question which led to this observation.)

The connection between the Benjamini-Hochberg procedure and goodness of fit has been
hinted at by other authors (e.g., [5, 6, 13]), but this seems to be the first explicit
link to be exhibited. In their seminal work Benjamini and Hochberg [2] provided some
justification of the appropriateness of their method, and so did Storey [18] in connection
with one of the variants mentioned above; the present explanation provides further insight
into the workings of the method, as well as into its domain of applicability.
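The no-rejection probability $1 - q$ obtained from the ballot theorem for independent standard uniform observations can be checked by simulation. The following is a rough Monte Carlo sketch; the function names, the seed and the simulation sizes are illustrative assumptions, not part of the paper.

```python
import random


def bh_count(sample, q):
    """R_m from (1.1), with R_m = 0 when no index qualifies."""
    m = len(sample)
    ordered = sorted(sample)
    return max((i for i in range(1, m + 1) if ordered[i - 1] <= q * i / m), default=0)


def estimate_no_rejection_probability(m, q, n_sim, seed=0):
    """Monte Carlo estimate of P(R_m = 0) when all m observations are uniform."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    no_rejections = 0
    for _ in range(n_sim):
        sample = [rng.random() for _ in range(m)]
        if bh_count(sample, q) == 0:
            no_rejections += 1
    return no_rejections / n_sim
```

With `m = 10`, `q = 0.25` and a few thousand replications, the estimate should lie close to `0.75`, up to Monte Carlo error.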

The objective of this article is to investigate the main properties of the
Benjamini-Hochberg method, extending and complementing the results of Benjamini and
Hochberg [2], Genovese and Wasserman [9] and Storey, Taylor and Siegmund [19], focusing
particularly on its asymptotic aspects as $m \to \infty$, $m_1 := m - m_0 \to \infty$ and
$\gamma$ remains fixed. In Section 2 we extend the Benjamini-Hochberg theorem and prove
some results on the convergence in probability of $R_m$ to infinity, and of $\Pi_{1,m}$ to
$q\gamma$, in what is essentially the setting originally adopted by Benjamini and
Hochberg [2]: $X_1, \ldots, X_{m_0}$ are independent and sub-uniform, and independent of
$X_{m_0+1}, \ldots, X_m$, but the latter can be anything. This set of assumptions is very
asymmetric in that too much is assumed from one set and nothing is assumed from the other,
but the results are potentially useful in a number of practical situations. In fact, the
proofs of Section 2 go through if the assumptions just stated hold conditionally on a
sigma field $\mathcal{G} \subseteq \mathcal{F}$, hence if $X_1, \ldots, X_{m_0}$ are, for
each $m_0$, part of an infinite exchangeable sequence independent of
$X_{m_0+1}, \ldots, X_m$, and so the results are more general than stated. (See [4] and
[15] for the Benjamini-Hochberg theorem under general dependence conditions. Recent
parallel developments in this area can be found in [10] and [17].)

But more interesting, perhaps, is that the results proved in Section 2 actu- 
ally hold in an asymptotic way under the rather general assumptions intro- 
duced by Storey, Taylor and Siegmund [19]. These assumptions, which essen- 
tially amount to the convergence of the sequence of empirical distributions, 
are more balanced and seem more realistic. In our work in Sections 3 and 4 
we adopt essentially the assumptions of [19] and obtain results which are 
parallel to theirs, namely about the convergence in probability of $R_m/m$
and $\Pi_{1,m}$; our approach allows some extensions and, we think, the quickest
and most transparent treatment of the main properties of the Benjamini-Hochberg
method. The results of Section 3 are extended in Section 4 to a
slight modification of Storey's [18] generalization of the Benjamini-Hochberg 
method, whose practical relevance and range of applicability are illustrated 
by the statements of Theorem 4.1. 

Before proceeding, let us introduce two statistical measures often used to assess the
performance of the Benjamini-Hochberg method,

$$
\Pi_{2,m} = \frac{R_m - S_m}{m - m_0} = \frac{R_m - S_m}{m_1}
\qquad\text{and}\qquad
\Pi_{3,m} = 1 - \frac{m_0 - S_m}{(m - R_m) \vee 1}.
$$

The first is the proportion of correctly rejected observations out of
$\{X_{m_0+1}, \ldots, X_m\}$, and its expected value will be called average power, or
simply power; it is the most popular and perhaps most straightforward efficacy measure
considered in the literature. The second is the proportion of incorrect nonrejections
among nonrejections and has been introduced by Genovese and Wasserman [9] as a dual
quantity to $\Pi_{1,m}$; its expected value is called false nondiscovery rate. The latter
seems to be a particularly useful concept in the context of "outlier detection" mentioned
above, where one would like to keep only a small number of outliers out of all the
observations judged to have come from $F$; in the multiple testing context it seems more
difficult to interpret than average power; but see Proposition 2.3 for an interpretation
in terms of the Benjamini-Hochberg method.
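As an illustration, the three proportions $\Pi_{1,m}$, $\Pi_{2,m}$ and $\Pi_{3,m}$ can be computed for a given sample as follows. This is a sketch under stated assumptions: the function name `bh_proportions` is hypothetical, the first `m0` entries of `pvalues` are taken to be the observations with d.f. $F$, `m0 < len(pvalues)` is assumed so that $m_1 > 0$, and the rejected observations are identified as those not exceeding $qR_m/m$.

```python
def bh_proportions(pvalues, m0, q):
    """Return (Pi_1, Pi_2, Pi_3) for the Benjamini-Hochberg procedure.

    Assumes pvalues[:m0] come from F (true nulls) and m0 < len(pvalues).
    """
    m = len(pvalues)
    ordered = sorted(pvalues)
    # R_m from (1.1); 0 if no order statistic passes its threshold.
    r_m = max((i for i in range(1, m + 1) if ordered[i - 1] <= q * i / m), default=0)
    # S_m: rejected observations among the first m0 (those <= q*R_m/m).
    s_m = sum(1 for x in pvalues[:m0] if x <= q * r_m / m)
    pi_1 = s_m / max(r_m, 1)                  # incorrect rejections / rejections
    pi_2 = (r_m - s_m) / (m - m0)             # correct rejections / m_1
    pi_3 = 1 - (m0 - s_m) / max(m - r_m, 1)   # incorrect nonrejections / nonrejections
    return pi_1, pi_2, pi_3
```

For example, `bh_proportions([0.015, 0.9, 0.03, 0.01, 0.6], m0=2, q=0.1)` rejects the three smallest values, one of which is a true null.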

2. Results in the original setting. Unless stated otherwise, $X_1, \ldots, X_{m_0}$ will
be assumed independent and such that $P(X_i \le x) \le F(x) := x$ for $x \in [0, 1]$, and
independent of $\{X_{m_0+1}, \ldots, X_m\}$. In the sequel, by $X^{(j)}_{i:m-j}$ we shall
mean the $i$th order statistic of the set
$\mathcal{X}^{(j)} := \mathcal{X} \setminus \{X_1, \ldots, X_j\}$, $j = 1, \ldots, m_0$,
and by $R_m^{(j)}(\mathcal{X}^{(j)})$ the number of rejections that result from applying
to $\mathcal{X}^{(j)}$ the modified form of the Benjamini-Hochberg procedure obtained by
replacing $i$ on the right-hand side of the inequalities in (1.1) by $i + j$; we shall
also write $R_m = R_m^{(0)}(\mathcal{X})$, $X_{i:m} = X^{(0)}_{i:m}$,
$\mathcal{X} = \mathcal{X}^{(0)}$. By the standard uniform case, we mean the case where
$X_1, \ldots, X_{m_0}$ are standard uniform r.v.'s.

Our first result gives upper bounds on the moments of $\Pi_{1,m}$ and $S_m$, and
contains Benjamini and Hochberg's [2] theorem as a special case.

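Theorem 2.1. We have

$$
E[(\Pi_{1,m})^k] \le \sum_{j=1}^{k} {k \brace j}
\left(q\,\frac{m_0}{m}\right) \cdots \left(q\,\frac{m_0 - j + 1}{m}\right)
E\left[\left(j + R_m^{(j)}(\mathcal{X}^{(j)})\right)^{j-k}\right]
\tag{2.1}
$$

and

$$
E(S_m^k) \le \sum_{j=1}^{k} {k \brace j}
\left(q\,\frac{m_0}{m}\right) \cdots \left(q\,\frac{m_0 - j + 1}{m}\right)
E\left[\left(j + R_m^{(j)}(\mathcal{X}^{(j)})\right)^{j}\right]
\tag{2.2}
$$

for $k = 1, 2, \ldots, m_0$, the inequalities being achieved for all $q$ only in the
standard uniform case.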

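Proof. We only prove (2.1); the proof of (2.2) is very similar. It will be evident that
there is no loss of generality in assuming that $X_1, \ldots, X_{m_0}$ have the same
distribution. Observe first that, for $0 \le r \le m$ (setting $X_{0:m} = 0$),
$R_m = r \Leftrightarrow X_{r:m} \le q\frac{r}{m} \wedge X_{s:m} > q\frac{s}{m}\ \forall s > r$,
and that, for $1 \le r \le m$,

$$
\begin{aligned}
\left\{X_1 \le q\tfrac{r}{m},\ R_m = r\right\}
&= \left\{X_1 \le q\tfrac{r}{m},\ X_{r:m} \le q\tfrac{r}{m},\ X_{s:m} > q\tfrac{s}{m}\ \forall s > r\right\} \\
&= \left\{X_1 \le q\tfrac{r}{m},\ X^{(1)}_{r-1:m-1} \le q\tfrac{r}{m},\ X^{(1)}_{s-1:m-1} > q\tfrac{s}{m}\ \forall s > r\right\} \\
&= \left\{X_1 \le q\tfrac{r}{m},\ X^{(1)}_{r-1:m-1} \le q\tfrac{(r-1)+1}{m},\ X^{(1)}_{s:m-1} > q\tfrac{s+1}{m}\ \forall s > r-1\right\} \\
&= \left\{X_1 \le q\tfrac{r}{m},\ R_m^{(1)}(\mathcal{X} \setminus \{X_1\}) = r - 1\right\}.
\end{aligned}
$$

Similarly,

$$
\left\{X_1 \le q\tfrac{r}{m}, \ldots, X_j \le q\tfrac{r}{m},\ R_m = r\right\}
= \left\{X_1 \le q\tfrac{r}{m}, \ldots, X_j \le q\tfrac{r}{m},\ R_m^{(j)}(\mathcal{X}^{(j)}) = r - j\right\}
$$

for $r = j, j+1, \ldots, m$, $j = 0, 1, \ldots, m_0$. Thus, since $\{X_1, \ldots, X_j\}$
and $\mathcal{X}^{(j)}$ are independent if $j \le m_0$, expanding the $k$th power of
$S_m$ in falling factorials (with Stirling numbers of the second kind ${k \brace j}$ as
coefficients) we have

$$
\begin{aligned}
E[(\Pi_{1,m})^k]
&= E\left[\sum_{r=1}^{m} \left(\frac{S_m}{r}\right)^{k} 1_{\{R_m = r\}}\right] \\
&= \sum_{r=1}^{m} \frac{1}{r^k}\, E\left[\left(\sum_{s=1}^{m_0} 1_{\{X_s \le qr/m\}}\right)^{k} 1_{\{R_m = r\}}\right] \\
&= \sum_{r=1}^{m} \sum_{j=1}^{k} {k \brace j} \frac{m_0 \cdots (m_0 - j + 1)}{r^k}\,
   E\left[1_{\{X_1 \le qr/m, \ldots, X_j \le qr/m\}}\, 1_{\{R_m(\mathcal{X}) = r\}}\right] \\
&= \sum_{j=1}^{k} \sum_{r=j}^{m} {k \brace j} \frac{m_0 \cdots (m_0 - j + 1)}{r^k}\,
   E\left[1_{\{X_1 \le qr/m, \ldots, X_j \le qr/m,\ R_m^{(j)}(\mathcal{X}^{(j)}) = r - j\}}\right] \\
&= \sum_{j=1}^{k} \sum_{r=j}^{m} {k \brace j} \frac{m_0 \cdots (m_0 - j + 1)}{r^k}\,
   E\left[1_{\{X_1 \le qr/m, \ldots, X_j \le qr/m\}}\right]
   E\left[1_{\{R_m^{(j)}(\mathcal{X}^{(j)}) = r - j\}}\right] \\
&\le \sum_{j=1}^{k} \sum_{r=j}^{m} {k \brace j} \frac{m_0 \cdots (m_0 - j + 1)}{r^k}
   \left(q\,\frac{r}{m}\right)^{j}
   E\left[1_{\{R_m^{(j)}(\mathcal{X}^{(j)}) = r - j\}}\right] \\
&= \sum_{j=1}^{k} {k \brace j} \left(q\,\frac{m_0}{m}\right) \cdots \left(q\,\frac{m_0 - j + 1}{m}\right)
   E\left[\sum_{r=j}^{m} r^{j-k}\, 1_{\{R_m^{(j)}(\mathcal{X}^{(j)}) = r - j\}}\right] \\
&= \sum_{j=1}^{k} {k \brace j} \left(q\,\frac{m_0}{m}\right) \cdots \left(q\,\frac{m_0 - j + 1}{m}\right)
   E\left[\left(j + R_m^{(j)}(\mathcal{X}^{(j)})\right)^{j-k}\right],
\end{aligned}
$$

equality holding for all $q$ if and only if $F$ is standard uniform. □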

Setting k = 1 at each step of the argument yields what is perhaps the 
simplest and most elementary available proof of the Benjamini-Hochberg 
theorem; Sarkar [15] gives a proof using similar ideas in a more general 
setting, and Storey, Taylor and Siegmund [19] give another simple proof 
based on the optional stopping theorem. 
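The case $k = 1$, that is, the identity $E(\Pi_{1,m}) = q\gamma$ in the standard uniform case, is easy to check by simulation. The sketch below is only illustrative: the function name is hypothetical, and the choice of $U^4$ (with $U$ standard uniform) for the non-null observations is an arbitrary assumption, since the theorem places no restriction on $X_{m_0+1}, \ldots, X_m$ beyond independence of the null observations.

```python
import random


def estimate_fdr(m, m0, q, n_sim, seed=0):
    """Monte Carlo estimate of E(Pi_1) = E[S_m / max(R_m, 1)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        nulls = [rng.random() for _ in range(m0)]               # standard uniform
        alternatives = [rng.random() ** 4 for _ in range(m - m0)]  # assumed non-null law
        ordered = sorted(nulls + alternatives)
        r_m = max((i for i in range(1, m + 1) if ordered[i - 1] <= q * i / m),
                  default=0)
        s_m = sum(1 for x in nulls if x <= q * r_m / m)  # rejected true nulls
        total += s_m / max(r_m, 1)
    return total / n_sim
```

With `m = 20`, `m0 = 10` and `q = 0.2`, the estimate should be close to $q\gamma = 0.1$, up to Monte Carlo error.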

As the following proposition shows, Theorem 2.1 with $k \ge 2$ can be used to derive
conclusions about the asymptotic properties of $\Pi_{1,m}$; the proof is given in the
Appendix.

Proposition 2.2. If $R_m \overset{P}{\to} \infty$, then

$$\limsup_{m \to \infty} E[(\Pi_{1,m})^k] \le (q\gamma)^k, \qquad k \in \mathbb{N}; \tag{2.3}$$

moreover, in the standard uniform case we have

$$R_m \overset{P}{\to} \infty \quad\text{if and only if}\quad \Pi_{1,m} \overset{P}{\to} q\gamma. \tag{2.4}$$

Remarks. (i) One practical rule that follows from (2.4) is this: If with large $m$ one
rejects a substantial (0.1, say, as opposed to 0.001) proportion $R_m/m$ of the sample
(indicating $R_m \overset{P}{\to} \infty$), then one can be sure that $\Pi_{1,m}$, the
proportion of incorrect rejections out of all rejections, is not only near, but is
practically equal to, the false discovery rate $E(\Pi_{1,m}) = q\gamma$.

(ii) Besides the false discovery rate, some authors consider $E(S_m)/E(R_m \vee 1)$,
sometimes called "marginal false discovery rate" (e.g., [20]). When $k = 1$, (2.2) yields
$E(S_m)/E[1 + R_m^{(1)}(\mathcal{X}^{(1)})] \le q\gamma$ with equality in the standard
uniform case, which almost represents the control of $E(S_m)/E(R_m \vee 1)$. Since, as
shown in the proof of Proposition 2.2, $R_m^{(1)}(\mathcal{X}^{(1)})$ is asymptotically no
smaller than $R_m$, it follows that in the standard uniform case

$$
\lim_{m \to \infty} E\left(\frac{S_m}{R_m \vee 1}\right) = q\gamma
= \lim_{m \to \infty} \frac{E(S_m)}{1 + E(R_m^{(1)}(\mathcal{X}^{(1)}))}
\le \liminf_{m \to \infty} \frac{E(S_m)}{1 + E(R_m)}
\le \liminf_{m \to \infty} \frac{E(S_m)}{E(R_m \vee 1)}.
$$

Because average power is an absolute quantity, there is nothing one can say about it
without some information on $X_{m_0+1}, \ldots, X_m$. More precisely, all that one can
conclude from Proposition 2.2 is that, because $R_m/m$ can be anything from 0 to 1 (as can
be seen from the results of Section 3),

$$
\Pi_{2,m} = \frac{R_m - S_m}{m_1}
= \frac{1}{1 - \gamma}\,\frac{R_m}{m}\left(1 - \frac{S_m}{R_m \vee 1}\right)
$$

(hence its expected value) is somewhere between 0 and $(1 - q\gamma)/(1 - \gamma) > 1$,
which, besides the truism that average power is between 0 and 1, only tells us that
$R_m/m$ is asymptotically bounded above by $(1 - \gamma)/(1 - q\gamma) < 1$.

In contrast, $E(\Pi_{3,m})$, the false nondiscovery rate of Genovese and Wasserman [9],
provides a relative measure of the performance of the Benjamini-Hochberg method (it
assesses the efficacy of the method in terms of the number of rejections), for which
reason one can use a statement like (2.3) to obtain a meaningful upper bound on
$\Pi_{3,m}$ (or on its moments):



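Proposition 2.3. Suppose $\gamma \in (0, 1]$. Then

$$
E[(\Pi_{3,m})^l] \le (1 - \gamma)^l + \frac{E[(\Pi_{1,m})^k]}{\gamma^k},
\qquad k, l \in \mathbb{N};
\tag{2.5}
$$

moreover, if $0 < q < 1$,

$$
R_m \overset{P}{\to} \infty \ \Rightarrow\
\limsup_{m \to \infty} E[(\Pi_{3,m})^l] \le (1 - \gamma)^l, \qquad l \in \mathbb{N}.
\tag{2.6}
$$

Proof. If $R_m = 0$, then $\Pi_{3,m} = 1 - \gamma$; if $R_m = m$, then $\Pi_{3,m} = 0$;
and if $R_m > 0$, we have
$\Pi_{3,m} = 1 - \frac{m_0 - S_m}{m - R_m} \le 1 - \gamma \Leftrightarrow \frac{S_m}{R_m} \le \gamma$.
Thus,

$$
\begin{aligned}
E[(\Pi_{3,m})^l]
&= (1 - \gamma)^l P(R_m = 0)
 + E\left((\Pi_{3,m})^l\, 1_{\{S_m/R_m \le \gamma\}} 1_{\{R_m > 0\}}\right)
 + E\left((\Pi_{3,m})^l\, 1_{\{S_m/R_m > \gamma\}} 1_{\{R_m > 0\}}\right) \\
&\le (1 - \gamma)^l P(R_m = 0) + (1 - \gamma)^l P(R_m > 0) + E\left(1_{\{\Pi_{1,m} > \gamma\}}\right) \\
&= (1 - \gamma)^l + P(\Pi_{1,m} > \gamma)
 \le (1 - \gamma)^l + \frac{E[(\Pi_{1,m})^k]}{\gamma^k}.
\end{aligned}
$$

By (2.5) and (2.3), $\limsup_{m \to \infty} E[(\Pi_{3,m})^l] \le (1 - \gamma)^l + q^k$,
and since $k \in \mathbb{N}$ is arbitrary (2.6) follows. □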

In words, (2.6) says that if $R_m \overset{P}{\to} \infty$, then, asymptotically, the
expected proportion of incorrect nonrejections in the Benjamini-Hochberg procedure with
arbitrary $q \in [0, 1)$ does not exceed the proportion $1 - \gamma$ of observations that
ideally one would like to reject. From a practical point of view, this seems to be a nice
"unbiasedness" property of the Benjamini-Hochberg method, one that should be required from
procedures for selecting $\mathcal{R}$ in general: at least in the limit, the proportion
of false hypotheses among those that pass unnoticed does not exceed the proportion of
false hypotheses that would go unnoticed if one simply considered all hypotheses true from
the start (if one did not even bother about investigating them), which is just another way
of saying that we are better off applying the Benjamini-Hochberg procedure than doing
nothing.

For other results on $\Pi_{3,m}$ and a definition of unbiasedness we refer the reader
to [16].



3. Asymptotic results under dependence. In what follows we assume that

$$
F_{m_0}(x) := \frac{1}{m_0} \sum_{k=1}^{m_0} 1_{\{X_k \le x\}} \overset{P}{\to} F(x) := x
\tag{3.1}
$$

and

$$
G_{m_1}(x) := \frac{1}{m_1} \sum_{k=m_0+1}^{m} 1_{\{X_k \le x\}} \overset{P}{\to} G(x)
\tag{3.2}
$$

uniformly in $x \in [0, 1]$, where $G$ is a d.f. concentrated on $[0, 1]$. These are weak
versions of the Glivenko-Cantelli theorem; a result at the end of this section gives some
sufficient conditions for them to hold.

The following theorem extends Theorem 1 of [9], and in part also Theo- 
rem 5 of [19] in the case of the Benjamini-Hochberg method — as opposed 
to the case of Storey's [18] variant of it (see the Remark to Theorem 4.1 for 
a parallel result in the case of what we call the Benjamini-Hochberg-Storey 
method). 



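Theorem 3.1. Under conditions (3.1) and (3.2) we have, for $k \in \mathbb{N}$,

$$
\left[\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right)\right]^k
\le \liminf_{m \to \infty} E\left[\left(\frac{R_m}{m}\right)^k\right]
\le \limsup_{m \to \infty} E\left[\left(\frac{R_m}{m}\right)^k\right]
\le \left[\bar\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right)\right]^k,
$$

where, for $y > 0$,

$$
\psi_q^*(y) = \inf\{x \in [0, 1] : \psi_q(x) \le 1/y\},
\qquad
\bar\psi_q^*(y) = \inf\{x \in [0, 1] : \psi_q(x) < 1/y\}
$$

and

$$\psi_q(x) = \sup_{qx \le t \le q} \frac{G(t) - t}{t}, \qquad x \in [0, 1].$$

In particular,

$$
\frac{R_m}{m} \overset{P}{\to} \rho = \rho(q, \gamma)
= \psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right)
\tag{3.3}
$$

whenever $\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right) = \bar\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right)$,
which will be the case if and only if $\psi_q$ does not assume the value
$(1-q)/[q(1-\gamma)]$ over an interval.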

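Proof. By (1.2) we have

$$
\left\{\frac{R_m}{m} \ge x\right\}
= \left\{\max_{t = q\lceil mx \rceil/m, \ldots, q(m-1)/m,\, q} \frac{H_m(t) - t}{t} \ge \frac{1}{q} - 1\right\}
$$

for each $x \in ((r-1)/m, r/m]$, so with
$\psi_q^{(m)}(x) = \max_{t = q\lceil mx \rceil/m, \ldots, q(m-1)/m,\, q} (H_m(t) - t)/t$
we have $P(R_m/m \ge x) = P(\psi_q^{(m)}(x) \ge 1/q - 1)$. Since for each $x > 0$

$$
\max_{t = q\lceil mx \rceil/m, \ldots, q(m-1)/m,\, q} \frac{|F_{m_0}(t) - t|}{t}
\le \max_{qx \le t \le q} \frac{|F_{m_0}(t) - t|}{t}
\le \frac{1}{qx} \max_{qx \le t \le q} |F_{m_0}(t) - t| \overset{P}{\to} 0
$$

and, similarly,
$\max_{t = q\lceil mx \rceil/m, \ldots, q} |G_{m_1}(t) - G(t)|/t \overset{P}{\to} 0$,
we have, writing $H_m = \gamma F_{m_0} + (1 - \gamma) G_{m_1}$,

$$
\psi_q^{(m)}(x)
= \max_{t = q\lceil mx \rceil/m, \ldots, q}
\left\{ \gamma\,\frac{F_{m_0}(t) - t}{t} + (1 - \gamma)\,\frac{G_{m_1}(t) - G(t)}{t}
 + (1 - \gamma)\,\frac{G(t) - t}{t} \right\}
\overset{P}{\to} (1 - \gamma) \max_{qx \le t \le q} \frac{G(t) - t}{t}.
$$

Thus,

$$
1_{((1-q)/(q(1-\gamma)),\, \infty)}(\psi_q(x))
\le \liminf_{m \to \infty} P\left(\psi_q^{(m)}(x) \ge \frac{1}{q} - 1\right)
\le \limsup_{m \to \infty} P\left(\psi_q^{(m)}(x) \ge \frac{1}{q} - 1\right)
\le 1_{[(1-q)/(q(1-\gamma)),\, \infty)}(\psi_q(x))
$$

for almost all $x$, whence, since $E[(R_m/m)^k] = \int_0^1 k x^{k-1} P(R_m/m \ge x)\,dx$,

$$
\int_0^1 k x^{k-1}\, 1_{((1-q)/(q(1-\gamma)),\, \infty)}(\psi_q(x))\,dx
\le \liminf_{m \to \infty} E\left[\left(\frac{R_m}{m}\right)^k\right]
\le \limsup_{m \to \infty} E\left[\left(\frac{R_m}{m}\right)^k\right]
\le \int_0^1 k x^{k-1}\, 1_{[(1-q)/(q(1-\gamma)),\, \infty)}(\psi_q(x))\,dx.
$$

Finally, from the definition of $\psi_q^*$ and the fact that $\psi_q$ is a nonincreasing
right-continuous function, we see that

$$
\int_{\{x \in [0,1] :\, \psi_q(x) > (1-q)/(q(1-\gamma))\}} k x^{k-1}\,dx
= \int_0^{\psi_q^*(q(1-\gamma)/(1-q))} k x^{k-1}\,dx
= \left[\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right)\right]^k,
$$

the analogous identity for $\bar\psi_q^*$ following similarly. □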

Remarks. (i) Storey, Taylor and Siegmund [19] were the first to realize that conditions
like (3.1) and (3.2) are sufficient to derive asymptotic results about Storey's [18]
variant of the Benjamini-Hochberg method. Storey, Taylor and Siegmund [19] actually assume
only $F(x) \le x$ in (3.1); assuming $F(x) = x$, however, allows us to obtain simple and
useful asymptotic expressions and bounds for $\Pi_{1,m}$, $\Pi_{2,m}$ and $\Pi_{3,m}$ (see
the corollaries to the theorem below and Theorem 4.1 later on) without sacrificing much in
the domain of practical applicability of the method. Storey, Taylor and Siegmund [19] also
assume almost sure convergence in (3.1) and (3.2); our results could as easily be
formulated in terms of almost sure convergence, but we find that convergence in
probability is more natural in this context: it seems easier to meet and is still very
relevant in applications.

(ii) As pointed out by Genovese and Wasserman [9], (3.3) says that asymptotically the
Benjamini-Hochberg procedure rejects the observations (or hypotheses whose p-values fall)
below $q\rho$. Thus, compared with the method of "uncritical multiple testing" in which
all hypotheses whose p-values fall below a critical value $q$ are rejected, the
Benjamini-Hochberg method always rejects a smaller proportion $q\rho(q, \gamma)$ of
hypotheses; on the other hand, because $q\rho(q, \gamma) > q/m$ for large $m$, it
typically rejects many more hypotheses than the corresponding Bonferroni procedure which,
for finite $m$, consists of rejecting all observations below $q/m$.

(iii) Suppose (3.3) holds. Then
$\rho(q, \gamma) > 0 \Leftrightarrow \max_{qx \le t \le q} \frac{G(t) - t}{t} > \frac{1-q}{q(1-\gamma)}$
for some $x > 0$, and it can be seen that

$$
\rho(q, \gamma) = q^{-1} \sup\left\{x \in [0, 1] : \frac{G(x) - x}{x} \ge \frac{1 - q}{q(1 - \gamma)}\right\},
\tag{3.4}
$$

or $q\rho(q, \gamma) = \sup\{x \in [0, 1] : \frac{x}{\gamma x + (1 - \gamma) G(x)} \le q\}$,
in agreement with Theorem 5 of Storey, Taylor and Siegmund [19]. Furthermore, it can be
verified from (3.4) that $\rho(q, \gamma)$ is left-continuous in $q$ for fixed $\gamma$,
and, using the condition expressed right after (3.3), that it is right-continuous at $q$
if $\rho(q, \gamma) > 0$. Thus, $q \mapsto \rho(q, \gamma)$ is continuous on $(q', q'')$
if $\rho(q, \gamma) > 0\ \forall q \in (q', q'')$, in which case
$R_m/m \overset{P}{\to} \rho(q, \gamma)$ uniformly on $[q', q'']$. ($R_m/m$ is a
nondecreasing right-continuous function of $q$.)

Examples. (i) Suppose $G$ is degenerate at $x_0 \in [0, 1)$. Then $\psi_q(x) = -1$ if
$q < x_0$, $\psi_q(x) = 1/x_0 - 1$ if $qx \le x_0 \le q$, and $\psi_q(x) = 1/(qx) - 1$ if
$qx > x_0$. If $x_0 > q(1 - \gamma)/(1 - q\gamma)$, that is, if
$1/x_0 - 1 < (1 - q)/[q(1 - \gamma)]$, then
$\bar\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right) = 0$, and hence $\rho = 0$.

If $x_0 < q(1 - \gamma)/(1 - q\gamma)$, then the equation
$\psi_q(x) = (1 - q)/[q(1 - \gamma)]$ has a unique solution given by
$x = (1 - \gamma)/(1 - q\gamma)$, so (3.3) holds and

$$
\rho = \psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right) = \frac{1 - \gamma}{1 - q\gamma}.
\tag{3.5}
$$

Thus, $\rho(q, \gamma) > 0$ if $x_0 < q(1 - \gamma)/(1 - q\gamma)$; that is, $\rho > 0$
if $x_0$ is not "too large" given the choice of $q$, in which case $\rho$ is actually
independent of $x_0$, implying that asymptotically the proportion of rejections and the
efficacy of the procedure depend only on $\gamma$ and on the choice of $q$ and not on the
exact position of $x_0$. In fact, it can be checked by substitution of (3.5) into the
expressions of the limits obtained below in (3.8) that $\Pi_{2,m}$ and $1 - \Pi_{3,m}$
both converge in probability to 1 when $x_0 < q(1 - \gamma)/(1 - q\gamma)$.

Since $q$ can always be chosen so that $x_0 < q(1 - \gamma)/(1 - q\gamma)$, we see that in
this case the Benjamini-Hochberg procedure can always be made to work in an asymptotically
optimal way, that is, in such a way that practically 100% of the observations from $G$
will be spotted and $\Pi_{1,m}$ is kept at $q\gamma$. In order to make use of this
optimality in practice, one needs to choose $q$ appropriately, but this is easy if
$\gamma$ is not too large, because the histogram will then have the shape of a scaled down
uniform density with a conspicuous peak at $x_0$ (which is why the problem is easy to
solve even without using the Benjamini-Hochberg method).
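The limit (3.5) for a degenerate $G$ can be illustrated with a single large simulated sample. The sketch below is only indicative: the function name, the seed and the parameter values are assumptions chosen so that $x_0 < q(1 - \gamma)/(1 - q\gamma)$, and the first $m_0$ observations are generated as independent standard uniforms.

```python
import random


def simulate_degenerate_case(m, gamma, q, x0, seed=0):
    """Return (R_m/m, S_m/max(R_m, 1)) when the m - m0 non-null values all equal x0."""
    rng = random.Random(seed)
    m0 = round(gamma * m)
    sample = [rng.random() for _ in range(m0)] + [x0] * (m - m0)
    ordered = sorted(sample)
    # R_m from (1.1); 0 if no order statistic passes its threshold.
    r_m = max((i for i in range(1, m + 1) if ordered[i - 1] <= q * i / m), default=0)
    # S_m counts rejected uniforms, i.e. those not exceeding q*R_m/m.
    s_m = sum(1 for x in sample[:m0] if x <= q * r_m / m)
    return r_m / m, s_m / max(r_m, 1)
```

With `gamma = 0.8`, `q = 0.5` and `x0 = 0.05`, formula (3.5) gives $\rho = 0.2/0.6 = 1/3$ and the limiting proportion of incorrect rejections is $q\gamma = 0.4$; a sample of size `m = 20000` should give values near these.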

In the borderline case where $x_0 = q(1 - \gamma)/(1 - q\gamma)$, the theorem only tells
us that $R_m/m$ is asymptotically somewhere between 0 and the right-hand side of (3.5),
because $\psi_q(x) = (1 - q)/[q(1 - \gamma)] = 1/x_0 - 1$ holds for all
$x \in (0, x_0/q)$. In fact, if $X_1, \ldots, X_{m_0}$ are independent standard uniform
r.v.'s, we have

$$\frac{R_m}{m} \overset{P}{\to} \frac{1 - \gamma}{2(1 - q\gamma)}. \tag{3.6}$$

To see this, note that, after being sorted in ascending order, the sample consists of a
proportion $H_m(x_0-)$ of ordered uniforms below $x_0$, followed by $m - m_0$ copies of
$x_0$, which are in turn followed by the remaining $m(1 - H_m(x_0))$ ordered uniforms, so
that the proportion of correctly rejected observations is always given by
$(R_m - S_m)/m = \max\{i : m H_m(x_0-) < i \le m - m(1 - H_m(x_0)),\ m x_0/q \le i\}/m - H_m(x_0-)$.
This is $\le 1 - \gamma$ and equals $1 - \gamma$ if and only if
$m x_0/q \le m - m(1 - H_m(x_0))$, or $F_{m_0}(x_0) - x_0 \ge 0$, which by our assumption
happens with probability tending to 1/2. Thus,
$\frac{R_m - S_m}{m} \overset{P}{\to} \frac{1 - \gamma}{2}$, and therefore (3.6) holds by
the fact that $S_m/R_m \overset{P}{\to} q\gamma$, which follows by Proposition 2.2 (note
that $R_m \overset{P}{\to} \infty$ necessarily).

Finally, we observe that in this borderline case $\Pi_{2,m}$ and $1 - \Pi_{3,m}$ converge
in probability to 1/2 and
$1 - (1 - \gamma)(1 - q\gamma)/[(1 - q\gamma) + \gamma(1 - q)]$, respectively, a
calculation suggesting that $\Pi_{2,m}$ is a more practically meaningful measure of
efficacy than $1 - \Pi_{3,m}$.

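(ii) Assume that $G$ is concave and

$$
G'(0) = \lim_{t \downarrow 0} \frac{G(t)}{t} > \beta,
\qquad\text{where}\quad \beta = \frac{1 - q\gamma}{q(1 - \gamma)}.
$$

Since then $G(0) = 0$ and $\beta > 1$, there exists a unique $t^* > 0$ such that
$G(t^*) = \beta t^*$; moreover, $t^* < q$ [because $1 \ge G(t^*) = \beta t^*$ and
$q\beta > 1 \Rightarrow t^* < q$], and it becomes evident on geometric grounds that

$$
\max_{t^* \le t \le q} \frac{G(t)}{t} - 1 = \frac{G(t^*)}{t^*} - 1 = \beta - 1
< \max_{qx \le t \le q} \frac{G(t)}{t} - 1 \qquad \forall x \in (0, t^*/q);
$$

thus,

$$\psi_q^*\!\left(\frac{q(1-\gamma)}{1-q}\right) = \frac{t^*}{q}.$$

Alternatively, by (3.4), $q\rho(q, \gamma)$ is the smallest positive root of
$G(t) = \beta t$, that is, $q\rho(q, \gamma) = t^*$. This was first proved by Genovese and
Wasserman [9].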
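For a concave $G$, the root $t^*$ of $G(t) = \beta t$ and hence the limiting rejection proportion $t^*/q$ can be located numerically. The following bisection sketch is an illustration only: the function name is hypothetical, and it assumes that $G(t) > \beta t$ for very small $t$ and $G(q) < \beta q$, so that the unique positive root lies in $(0, q)$.

```python
import math


def limiting_rejection_proportion(G, q, gamma, tol=1e-12):
    """Return t*/q, where t* solves G(t) = beta*t, with beta = (1 - q*gamma)/(q*(1 - gamma)).

    Assumes G is concave with G(t) > beta*t near 0 and G(q) < beta*q.
    """
    beta = (1 - q * gamma) / (q * (1 - gamma))
    lo, hi = 1e-15, q
    if not (G(lo) > beta * lo and G(hi) < beta * hi):
        raise ValueError("root of G(t) = beta*t is not bracketed in (0, q)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if G(mid) >= beta * mid:
            lo = mid   # still on or above the line: the root is to the right
        else:
            hi = mid   # below the line: the root is to the left
    return (lo + hi) / 2 / q
```

As a usage example, take the concave d.f. $G(t) = \sqrt{t}$ with $q = 0.1$ and $\gamma = 0.5$: then $\beta = 19$, $t^* = 1/361$, and `limiting_rejection_proportion(math.sqrt, 0.1, 0.5)` should return approximately $10/361 \approx 0.0277$.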

(iii) For an example where $G$ is not necessarily concave take
$G(x) = p x^{\alpha} + (1 - p) x^{\beta}$, $0 \le x \le 1$, with $\alpha \in (0, 1)$,
$\beta > 1$, $0 < p < 1$. Then $(G(t) - t)/t = p t^{\alpha - 1} + (1 - p) t^{\beta - 1} - 1$,
and from (3.4) we see that $\rho > 0$ always exists and is uniquely determined by
$p (q\rho)^{\alpha - 1} + (1 - p)(q\rho)^{\beta - 1} - 1 = (1 - q)/[q(1 - \gamma)]$,
provided $q > 0$.



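Using Theorem 3.1, we can show that the conclusion of the Benjamini-Hochberg theorem
holds very generally in an asymptotic sense:

Corollary 3.2. Under the conditions of Theorem 3.1,

$$
\frac{R_m}{m} \overset{P}{\to} \rho > 0 \ \Rightarrow\ \Pi_{1,m} \overset{P}{\to} q\gamma.
\tag{3.7}
$$

Proof. Since

$$
\Pi_{1,m} = \frac{S_m}{R_m \vee 1}
= \frac{\sum_{i=1}^{m_0} 1_{\{X_i \le qR_m/m\}}}{R_m \vee 1}
= \gamma\,\frac{(1/m_0) \sum_{i=1}^{m_0} 1_{\{X_i \le qR_m/m\}}}{(R_m \vee 1)/m},
$$

we have for arbitrary $\varepsilon \in (0, \rho)$, $\eta \in (0, 1)$ and all sufficiently
large $m$,

$$
\gamma\,\frac{F_{m_0}(q(\rho - \varepsilon))}{\rho + \varepsilon}
\le \Pi_{1,m} \le
\gamma\,\frac{F_{m_0}(q(\rho + \varepsilon))}{\rho - \varepsilon},
$$

with probability at least $1 - \eta$, which by (3.1) proves (3.7). □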

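The following statements are all direct consequences of the preceding results.

Corollary 3.3. Under the conditions of Theorem 3.1,
$\frac{R_m}{m} \overset{P}{\to} \rho(q, \gamma) > 0$ implies

$$
\frac{S_m}{m} \overset{P}{\to} \rho(q, \gamma)\, q\gamma,
\qquad
\Pi_{2,m} \overset{P}{\to} \frac{\rho(q, \gamma)(1 - q\gamma)}{1 - \gamma},
\qquad
\Pi_{3,m} \overset{P}{\to} 1 - \frac{\gamma\,(1 - q\rho(q, \gamma))}{1 - \rho(q, \gamma)}.
\tag{3.8}
$$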

Because $R_m/m$, $S_m/m$, $\Pi_{1,m}$, $\Pi_{2,m}$ and $\Pi_{3,m}$ are proportions, all
the above statements about convergence in probability to a constant are equivalent to
statements about convergence in the mean (of any order), as well as to statements about
convergence of their moments. One consequence of this fact is that, under the conditions
of Theorem 3.1,

$$
\frac{R_m}{m} \overset{P}{\to} \rho > 0 \ \Rightarrow\
\lim_{m \to \infty} \frac{E[S_m]}{E[R_m]}
= \lim_{m \to \infty} E\left[\frac{S_m}{R_m \vee 1}\right],
$$

which implies that, asymptotically, the Benjamini-Hochberg method also controls the
"marginal false discovery rate" $E(S_m)/E(R_m \vee 1)$ [briefly mentioned in Remark (ii)
to Proposition 2.2].

We shall finish this section by giving an example of a rather general situation in which
statements like (3.1) and (3.2) hold true uniformly in $x$; a similar result (with a
stronger conclusion) for stationary ergodic sequences has been given by Tucker [21], for
example. Let $\xi_1, \xi_2, \ldots$ be a sequence of r.v.'s on $[0, 1]$ with d.f.'s
$G^{(1)}, G^{(2)}, \ldots$. Since for each $x$,
$G_n(x) := n^{-1} \sum_{i=1}^{n} 1_{\{\xi_i \le x\}} \overset{P}{\to} G(x)$ if and only if
$E G_n(x) \to G(x)$ and $E(G_n(x)^2) \to G(x)^2$, we see that
$G_n(x) \overset{P}{\to} G(x)$ is equivalent to

$$
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} G^{(i)}(x) = G(x)
\qquad\text{and}\qquad
\lim_{n \to \infty} \frac{1}{n^2} \sum_{i,j=1}^{n} P(\xi_i \le x,\ \xi_j \le x) = G(x)^2.
$$

The following sufficient condition combines this observation with a condition that is
much weaker than strong mixing.

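Proposition 3.4. Assume that, for each $x$,

$$
G(x) := \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} G^{(i)}(x)
\qquad\text{and}\qquad
G(x-) := \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} G^{(i)}(x-)
$$

exist, and there are subsequences $\{k_n\}$ and $\{\alpha_{k_n}\}$ such that
$k_n \to \infty$, $k_n/n \to 0$ and $\alpha_{k_n} \to 0$ as $n \to \infty$, and

$$
\sup_{|i - j| > k_n} \max\{\,|P(\xi_i \le x,\ \xi_j \le x) - P(\xi_i \le x) P(\xi_j \le x)|,\
|P(\xi_i < x,\ \xi_j < x) - P(\xi_i < x) P(\xi_j < x)|\,\} \le \alpha_{k_n}.
$$

Then $G_n \overset{P}{\to} G$ uniformly.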

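Proof. That $G_n(x) \xrightarrow{P} G(x)$ for fixed $x$ follows from the fact that
$\lim_{n\to\infty} n^{-2}\sum_{i,j=1}^n P(\xi_i \le x)P(\xi_j \le x) = \lim_{n\to\infty} \bigl(n^{-1}\sum_{i=1}^n P(\xi_i \le x)\bigr)^2 = G(x)^2$
and from the inequalities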

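$$\left|\frac{1}{n^2}\sum_{i,j=1}^n P(\xi_i \le x, \xi_j \le x) - \frac{1}{n^2}\sum_{i,j=1}^n P(\xi_i \le x)P(\xi_j \le x)\right|$$
$$\le \frac{2k_n+1}{n} + \frac{1}{n^2}\sum_{|i-j|>k_n} |P(\xi_i \le x, \xi_j \le x) - P(\xi_i \le x)P(\xi_j \le x)|
\le \frac{2k_n+1}{n} + \alpha_{k_n}$$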



(the right-hand side of which goes to zero as $n \to \infty$ by assumption). The
analogous statement with $< x$ in place of $\le x$ and $x-$ in place of $x$ follows
in the same way. Finally, that these pointwise results imply uniform
convergence is a classical result. □
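
As an illustration of Proposition 3.4, the sketch below simulates one simple dependent sequence and measures how far its empirical d.f. is from the limit. The sequence $\xi_i = \max(U_i, U_{i+1})$ with i.i.d. standard uniform $U_i$ is an assumption made for this example only: it is 1-dependent, so the mixing condition holds with $\alpha_{k_n} = 0$ for any $k_n \ge 1$, and every $\xi_i$ has d.f. $G(x) = x^2$. The function name and the seed are hypothetical.

```python
import random


def sup_distance_to_limit(n: int, seed: int = 0) -> float:
    """Return sup_x |G_n(x) - G(x)| for xi_i = max(U_i, U_{i+1}), G(x) = x**2.

    The sequence is 1-dependent: xi_i and xi_j are independent when |i - j| > 1.
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n + 1)]
    xi = sorted(max(u[i], u[i + 1]) for i in range(n))
    sup = 0.0
    for i, v in enumerate(xi, start=1):
        g = v * v  # limiting d.f. at the i-th order statistic
        # G_n jumps from (i - 1)/n to i/n at v, so check both sides of the jump.
        sup = max(sup, i / n - g, g - (i - 1) / n)
    return sup


for n in (100, 1000, 20000):
    print(n, sup_distance_to_limit(n))
```

The printed distances depend on the seed; under the proposition they should shrink as $n$ grows.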

4. A modification of the method. It has been observed by several authors
that the Benjamini-Hochberg method tends to be conservative unless $\gamma$ is
relatively close to 1. For if the value of $\gamma$ cannot be guessed at, the only
way one can guarantee that $E(\Pi_{1,m}) \le \delta$ for a given $\delta > 0$ is to apply the
method with $q = \delta$. But if $\gamma$ is actually smaller, say equal to 1/2, such a
choice yields the overcautious bound $E(\Pi_{1,m}) \le \delta/2$ and the concomitant
decrease in $\Pi_{2,m}$, which is an increasing function of $q$. Although in some
practical situations this is hardly a problem because one has a reasonably
good idea about the value of $\gamma$, from a general point of view it is still a
shortcoming one would like to eliminate.

These considerations have led Benjamini and Hochberg [3], Storey [18]
and Storey, Taylor and Siegmund [19], among others, to propose and study
variants of the Benjamini-Hochberg method which incorporate estimates of
$\gamma$. Our objective here will be to introduce another variant, very similar to
Storey's, and to study some of its asymptotic properties. Questions related
to the practical application of the method [e.g., the problem of choosing $x$
in (4.1) below] will be considered elsewhere. Our assumptions and notation
will be those of Section 3.

The closer $x$ gets to $G^{-1}(1)$, the tighter the inequality
$H(x) = \gamma x + (1-\gamma)G(x) \le \gamma x + (1-\gamma)$, or $\gamma \le (1-H(x))/(1-x)$,
becomes, which suggests taking

(4.1) $$\gamma_m(x) = \min_{0 \le t \le x} \frac{1 - H_m(t)}{1 - t},$$

where $x \in (0,1)$ is to be chosen, as an estimator of $\gamma$ [note that, for fixed
$x \in (0,1)$, $\gamma_m(x) > 0$ with probability tending to 1]. (Storey's [18] estimator
is defined by $(1 - H_m(x))/(1 - x)$ for a given $x$.) Because of the convergence
of $H_m$ to $H$, this $\gamma_m(x)$ will typically be an overestimate of $\gamma$ in the sense
that, given $\epsilon > 0$,

(4.2) $$\gamma_m(x) = \min_{0 \le t \le x} \frac{1 - H_m(t)}{1 - t} \ge \min_{0 \le t \le x} \frac{1 - H(t)}{1 - t} - \epsilon \ge \gamma - \epsilon,$$

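with high probability if $m$ is large enough. On the other hand, if we put

$$\kappa(x) = \min_{0 \le t \le x} \frac{1 - G(t)}{1 - t}, \qquad x \in (0,1),$$

we see that $\gamma_m(x)$ will typically not exceed $\gamma$ by more than $(1-\gamma)\kappa(x)$:

(4.3) $$\gamma_m(x) = \min_{0 \le t \le x} \frac{1 - H_m(t)}{1 - t}
\le \epsilon + \min_{0 \le t \le x}\left\{\gamma + (1-\gamma)\frac{1 - G(t)}{1 - t}\right\}
= \epsilon + \gamma + (1-\gamma)\kappa(x),$$

with high probability for arbitrary $\epsilon > 0$ if $m$ is large enough.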

For want of a better name, and because we are essentially using the
ideas of Benjamini and Hochberg [2] and Storey [18], we shall refer to
the procedure that consists of rejecting all observations smaller than or
equal to $X_{R_m(q_m(x,\delta)):m}$, where $R_m(q_m(x,\delta)) = \max\{i : X_{i:m} \le q_m(x,\delta)\, i/m\}$,
$q_m(x,\delta) = \delta/\gamma_m(x)$ and $\gamma_m(x)$ is defined by (4.1), as the
Benjamini-Hochberg-Storey method.
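
The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the observations are taken to lie in $[0,1]$, the maximum defining $R_m(q)$ is taken to be 0 when no index qualifies, and a zero estimate $\gamma_m(x)$ is read as $q_m = +\infty$ (reject everything). Because $H_m$ is a right-continuous step function, the minimum in (4.1) only needs to be checked at $t = 0$ and at the observations not exceeding $x$.

```python
from bisect import bisect_right
from typing import Sequence


def gamma_hat(p: Sequence[float], x: float) -> float:
    """Estimator (4.1): min over 0 <= t <= x of (1 - H_m(t)) / (1 - t)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie in (0, 1)")
    m = len(p)
    xs = sorted(p)
    # Between jumps of H_m the ratio increases in t, so the minimum is
    # attained at t = 0 or at an observation that does not exceed x.
    candidates = [0.0] + [t for t in xs if t <= x]
    return min((1.0 - bisect_right(xs, t) / m) / (1.0 - t) for t in candidates)


def bh_rejections(p: Sequence[float], q: float) -> int:
    """R_m(q) = max{i : X_{i:m} <= q * i / m}; 0 if no index qualifies (assumed)."""
    m = len(p)
    r = 0
    for i, v in enumerate(sorted(p), start=1):
        if v <= q * i / m:
            r = i
    return r


def bhs_rejections(p: Sequence[float], x: float, delta: float) -> int:
    """Number of rejections when q is replaced by q_m(x, delta) = delta / gamma_m(x)."""
    g = gamma_hat(p, x)
    if g == 0.0:
        # Assumed convention: delta / 0 is read as +infinity, so reject all.
        return len(p)
    return bh_rejections(p, delta / g)


p_values = [0.01, 0.05, 0.12, 0.5, 0.9]  # made-up data for illustration
print(gamma_hat(p_values, 0.5))
print(bh_rejections(p_values, 0.1))
print(bhs_rejections(p_values, 0.5, 0.1))
```

On this toy input the estimate of $\gamma$ is well below 1, so the modified method uses a larger $q$ and rejects more observations than the plain method run with $q = \delta$.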

The variable $R_m$ of (1.1) will now be denoted by $R_m(q)$ in order to indicate
its dependence on $q$ in the Benjamini-Hochberg method, and similarly
for the other variables; for instance, we shall write $\Pi_{1,m}(q)$ for $\Pi_{1,m}$, and
$\Pi_{1,m}(q_m(x,\delta))$ for the proportion of incorrect rejections incurred by applying
the Benjamini-Hochberg-Storey method.

The following result shows that, with the modified method, one is able, 
in an asymptotic sense, to keep the false discovery rate under control and 
at the same time achieve greater average power than that provided by the 
Benjamini-Hochberg procedure. 

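Theorem 4.1. Let $\gamma \in (0,1)$ and suppose $\delta > 0$, $x \in (0,1)$, $q'(x)$ and
$q''(x)$ can be chosen so that

$$q'(x) < \frac{\delta}{\gamma + (1-\gamma)\kappa(x)} \le \frac{\delta}{\gamma} < q''(x)$$

and

$$\frac{R_m(q)}{m} \xrightarrow{P} \rho(q,\gamma) > 0 \qquad \forall q \in [q'(x), q''(x)].$$

Then

(4.4) $$\frac{\delta\gamma}{\gamma + (1-\gamma)\kappa(x)}
\le \liminf_{m\to\infty} E[\Pi_{1,m}(q_m(x,\delta))]
\le \limsup_{m\to\infty} E[\Pi_{1,m}(q_m(x,\delta))] \le \delta$$

and

(4.5) $$\rho\left(\frac{\delta}{\gamma + (1-\gamma)\kappa(x)}, \gamma\right)
\frac{1 - \delta\gamma/(\gamma + (1-\gamma)\kappa(x))}{1-\gamma}
\le \liminf_{m\to\infty} E[\Pi_{2,m}(q_m(x,\delta))]
\le \limsup_{m\to\infty} E[\Pi_{2,m}(q_m(x,\delta))]
\le \rho\left(\frac{\delta}{\gamma}, \gamma\right)\frac{1-\delta}{1-\gamma}.$$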



Proof. We know from Corollary 3.3 that we have

$$\frac{R_m(q)}{m} \xrightarrow{P} \rho(q,\gamma)
\quad\text{as well as}\quad
\frac{S_m(q)}{m} \xrightarrow{P} \rho(q,\gamma)\,q\gamma$$

$\forall q \in [q'(x), q''(x)]$; moreover, by Remark (iii) following Theorem 3.1, the
convergence here is uniform on $[q'(x), q''(x)]$. It can be shown (and it is
certainly known) that if $f_n \to f$ and $g_n \to g$ uniformly, $\sup_t |f(t)| < \infty$ and
$\inf_t |g(t)| > 0$, then $\sup_t |f_n(t)/g_n(t) - f(t)/g(t)| \to 0$. Thus,

(4.6) $$\sup_{q'(x) \le q \le q''(x)} \left|\frac{S_m(q)}{R_m(q) \vee 1} - q\gamma\right|
= \sup_{q'(x) \le q \le q''(x)} |\Pi_{1,m}(q) - q\gamma| \xrightarrow{P} 0.$$


Now fix $\epsilon \in (0,\gamma)$, $\eta \in (0,1)$ and $m'$ so large that

(4.7) $$q'(x) \le \frac{\delta}{\gamma + (1-\gamma)\kappa(x) + \epsilon}
\le q_m(x,\delta) = \frac{\delta}{\gamma_m(x)} \le \frac{\delta}{\gamma - \epsilon} \le q''(x),$$

with probability at least $1 - \eta$ if $m \ge m'$, which is possible by (4.2), (4.3)
and our assumptions about $q'(x)$ and $q''(x)$. Then for $m \ge m'$

$$\Pi_{1,m}(q_m(x,\delta)) \le \sup_{q'(x) \le q \le q''(x)} |\Pi_{1,m}(q) - q\gamma| + \delta\,\frac{\gamma}{\gamma - \epsilon}$$

holds with probability at least $1 - \eta$. Since $\epsilon$ is arbitrarily small, this,
combined with (4.6), proves the inequality on the right-hand side in (4.4) as well
as its version in probability. The other inequality follows similarly.
To prove (4.5), we use the inequalities

$$\frac{R_m(\delta/(\gamma + (1-\gamma)\kappa(x) + \epsilon)) - S_m(\delta/(\gamma + (1-\gamma)\kappa(x) + \epsilon))}{m - m_0}
\le \frac{R_m(q_m(x,\delta)) - S_m(q_m(x,\delta))}{m - m_0}
\le \frac{R_m(\delta/(\gamma - \epsilon)) - S_m(\delta/(\gamma - \epsilon))}{m - m_0},$$

which hold whenever (4.7) is valid because $R_m(q) - S_m(q)$ is nondecreasing
in $q$, and the continuity of $q \mapsto \rho(q,\gamma)$ on $[q'(x), q''(x)]$. □



Remark. Under the assumptions of the theorem, we have
$q_m(x,\delta) \xrightarrow{P} q(x,\delta) := \delta/(\gamma + (1-\gamma)\kappa(x))$ and
$R_m(q_m(x,\delta))/m \xrightarrow{P} \rho(q(x,\delta),\gamma)$; thus, asymptotically,
the Benjamini-Hochberg-Storey method consists of rejecting all observations
below $q(x,\delta)\rho(q(x,\delta),\gamma)$.

Examples. (i) If $G(x) = x^a$, $x \in [0,1]$, $a \in (0,1)$, then
$\kappa(x) = (1 - x^a)/(1 - x)$ because $t \mapsto (1 - t^a)/(1 - t)$ is decreasing. [In fact, if $G$ has
a nonincreasing density function $g$, then $1 - G(t) = \int_t^1 g(s)\,ds \le (1 - t)g(t)$,
or $-g(t)(1 - t) + (1 - G(t)) \le 0$, which implies that the derivative of
$t \mapsto (1 - G(t))/(1 - t)$ is negative.] In this case [see Example (ii) following
Theorem 3.1], it can be seen that $\rho(q,\gamma) = (q(1-\gamma)/(1-q\gamma))^{1/(1-a)}/q$, which is
always positive for $q > 0$, and so we have explicit expressions for the bounds
in Theorem 4.1 that are valid for all $x \in (0,1)$. Here we shall consider $a = 0.1$
in two cases: (a) $\gamma = 0.5$, (b) $\gamma = 0.9$. The density $h$ of $H$ in case (a) is roughly
in agreement with the histogram shown in Figure 5.8 of [12]; that of case
(b) is much closer to the standard uniform density; they are both compared
with the latter in Figure 1.
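
The bracketed remark in Example (i) can be spelled out in one line. The following is a sketch, assuming that $g$ is a nonincreasing density of $G$ on $[0,1]$, that $0 \le t < 1$, and that the derivative exists at $t$:

```latex
\frac{d}{dt}\,\frac{1-G(t)}{1-t}
  = \frac{-g(t)(1-t) + \bigl(1-G(t)\bigr)}{(1-t)^{2}}
  \le 0,
\qquad\text{since}\qquad
1-G(t) = \int_{t}^{1} g(s)\,ds \le (1-t)\,g(t).
```

The ratio is therefore nonincreasing in $t$, so the minimum defining $\kappa(x)$ is attained at $t = x$ and $\kappa(x) = (1 - G(x))/(1 - x)$; for $G(x) = x^a$ this gives $(1 - x^a)/(1 - x)$.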

The asymptotic average power and false discovery rate of the Benjamini- 
Hochberg procedure are shown in Figure 2 as functions of q. In case (a), 
the choice of q = 0.2 yields an asymptotic false discovery rate of 0.1 and an 
asymptotic average power of 0.784; in case (b), an asymptotic false discovery 
rate of 0.1 is guaranteed by taking q = 0.111, which yields an asymptotic 
average power of 0.614. 
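
The figures quoted above can be checked numerically. The sketch below uses the closed form of $\rho(q,\gamma)$ from Example (i) and assumes, as the bounds in (4.5) suggest, that the asymptotic average power is $\rho(q,\gamma)(1 - q\gamma)/(1 - \gamma)$ and the asymptotic false discovery rate is $q\gamma$. The function names are hypothetical.

```python
def rho(q: float, gamma: float, a: float) -> float:
    """Limit of R_m(q)/m when G(x) = x**a, from Example (i)."""
    return (q * (1.0 - gamma) / (1.0 - q * gamma)) ** (1.0 / (1.0 - a)) / q


def asymptotic_fdr(q: float, gamma: float) -> float:
    """Limit of the proportion of incorrect rejections."""
    return q * gamma


def asymptotic_power(q: float, gamma: float, a: float) -> float:
    """Assumed limit of the average power: rho * (1 - q*gamma) / (1 - gamma)."""
    return rho(q, gamma, a) * (1.0 - q * gamma) / (1.0 - gamma)


# Cases (a) and (b) of Example (i), with a = 0.1.
for label, gamma, q in (("a", 0.5, 0.2), ("b", 0.9, 0.111)):
    print(label, asymptotic_fdr(q, gamma), asymptotic_power(q, gamma, 0.1))
```

Under these assumptions the printed powers should come out close to the 0.784 and 0.614 reported in the text.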

Figure 3 illustrates the tightness of the bounds in (4.5) as a function
of $x$ when $\delta$ (the upper bound of the false discovery rate) is fixed at 0.1;
as just seen, in the ideal situation where $\gamma$ is known, the power obtained
by controlling the false discovery rate at this level would be about 0.784
and 0.614 in the cases $\gamma = 0.5$ and $\gamma = 0.9$, respectively. In each case,
the asymptotic average power of the Benjamini-Hochberg-Storey procedure
with $q_m(x,\delta) = 0.1/\gamma_m(x)$ lies between the two curves of Figure 3 and is
rather close to the maximum average power, achieved by setting $q = \delta/\gamma$
in the Benjamini-Hochberg procedure, even for small values of $x$. However,
since $\kappa(x) \to a$ as $x \uparrow 1$, the lower bound for asymptotic average power is
always strictly below $\rho(\delta/(\gamma + (1-\gamma)a),\gamma)(1 - \delta\gamma/(\gamma + (1-\gamma)a))/(1-\gamma)$, which in
turn is always strictly below the asymptotic average power of the
Benjamini-Hochberg procedure with $q = \delta/\gamma$.
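
The two curves of Figure 3 can be tabulated from the bounds in (4.5). The sketch below assumes $G(x) = x^a$ as in Example (i), so that $\kappa(x) = (1 - x^a)/(1 - x)$, and again takes $\rho(q,\gamma)(1 - q\gamma)/(1 - \gamma)$ as the asymptotic power. The function names and the grid of $x$ values are hypothetical choices for illustration.

```python
from typing import Tuple


def rho(q: float, gamma: float, a: float) -> float:
    """Limit of R_m(q)/m when G(x) = x**a, from Example (i)."""
    return (q * (1.0 - gamma) / (1.0 - q * gamma)) ** (1.0 / (1.0 - a)) / q


def power(q: float, gamma: float, a: float) -> float:
    """Assumed asymptotic average power at level q."""
    return rho(q, gamma, a) * (1.0 - q * gamma) / (1.0 - gamma)


def kappa(x: float, a: float) -> float:
    """kappa(x) = (1 - x**a) / (1 - x) for G(x) = x**a and 0 < x < 1."""
    return (1.0 - x ** a) / (1.0 - x)


def power_bounds(x: float, delta: float, gamma: float, a: float) -> Tuple[float, float]:
    """Lower and upper bounds of (4.5) on the asymptotic average power."""
    q_low = delta / (gamma + (1.0 - gamma) * kappa(x, a))
    q_high = delta / gamma
    return power(q_low, gamma, a), power(q_high, gamma, a)


for gamma in (0.5, 0.9):
    for x in (0.1, 0.5, 0.9):
        low, high = power_bounds(x, 0.1, gamma, 0.1)
        print(gamma, x, low, high)
```

The lower bound increases with $x$ but stays below the value obtained by replacing $\kappa(x)$ with its limit $a$, which is itself below the upper bound.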




Fig. 1. Densities of the standard uniform distribution and of the d.f. $H$: left panel:
$a = 0.1$, $\gamma = 0.5$, right panel: $a = 0.1$, $\gamma = 0.9$.

Fig. 2. Asymptotic average power and false discovery rate of the Benjamini-Hochberg
procedure as functions of $q$: left panel: $a = 0.1$, $\gamma = 0.5$, right panel: $a = 0.1$, $\gamma = 0.9$.



The left-hand side of (4.4) approaches $\delta = 0.1$ in a very similar way.

(ii) Suppose $G(x) = x^a 1_{[0,x_0)}(x) + 1_{[x_0,\infty)}(x)$ for $x_0, a \in (0,1)$. Then
$\psi_q(x) = (qx)^{a-1} - 1$ if $0 < x < x_0/q$ and $\psi_q(x) = 0$ if $x \ge x_0/q$, so that $\rho(q,\gamma)$ is still
positive and has the same expression as in (i) as long as
$(q(1-\gamma)/(1-q\gamma))^{1/(1-a)} < x_0$, which can always be arranged by choosing a small enough
$q$. Since $\kappa(x) = (1 - x^a)/(1 - x)$ for $x \in [0, x_0)$ and $= 0$ for $x \in [x_0, 1)$,
the lower bounds on the average power of the Benjamini-Hochberg-Storey
procedure as a function of $x$ coincide in this case with those shown in
Figure 3 over the interval $[0, x_0)$, but attain their maximum values over $[x_0, 1)$;
analogously, the lower bounds on the false discovery rate attain the value of
$\delta$ if $x \in [x_0, 1)$.
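
A small numerical check of Example (ii) may help. The sketch below evaluates $\kappa(x)$ for the truncated d.f. by a brute-force minimum over a grid (an approximation chosen for illustration, with $x$ itself always included among the grid points) and then forms the limiting level $\delta/(\gamma + (1-\gamma)\kappa(x))$ and the lower bound in (4.4). The function names, the grid size and the parameter values are hypothetical.

```python
def G(t: float, a: float, x0: float) -> float:
    """Truncated d.f. of Example (ii): t**a below x0, equal to 1 from x0 on."""
    return t ** a if t < x0 else 1.0


def kappa(x: float, a: float, x0: float, grid: int = 2000) -> float:
    """Grid approximation of min over 0 <= t <= x of (1 - G(t)) / (1 - t)."""
    ts = [x * i / grid for i in range(grid)] + [x]
    return min((1.0 - G(t, a, x0)) / (1.0 - t) for t in ts)


def q_limit(x: float, delta: float, gamma: float, a: float, x0: float) -> float:
    """Limiting level delta / (gamma + (1 - gamma) * kappa(x))."""
    return delta / (gamma + (1.0 - gamma) * kappa(x, a, x0))


def fdr_lower_bound(x: float, delta: float, gamma: float, a: float, x0: float) -> float:
    """Left-hand side of (4.4)."""
    return gamma * q_limit(x, delta, gamma, a, x0)


a, x0, gamma, delta = 0.1, 0.6, 0.5, 0.1
for x in (0.3, 0.6, 0.7):
    print(x, kappa(x, a, x0), q_limit(x, delta, gamma, a, x0),
          fdr_lower_bound(x, delta, gamma, a, x0))
```

For $x \ge x_0$ the computed $\kappa(x)$ is zero, the limiting level equals $\delta/\gamma$, and the lower bound on the false discovery rate equals $\delta$, in line with the discussion above.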

In this case, therefore, using $q_m(x,\delta) = \delta/\gamma_m(x)$ with $x \in [x_0, 1)$ in place
of $q$ in the Benjamini-Hochberg procedure and choosing $\delta$ according to the
conditions of Theorem 4.1 is asymptotically equivalent to taking $q = \delta/\gamma$ and
thus corresponds to the ideal situation in which $\gamma$ is known, the required
upper bound for the false discovery rate is $\delta$, and the power is maximum.

Fig. 3. Upper and lower bounds on the asymptotic average power of the
Benjamini-Hochberg-Storey procedure as functions of $x$ as given in (4.5): left panel: $a = 0.1$,
$\gamma = 0.5$, right panel: $a = 0.1$, $\gamma = 0.9$.

APPENDIX: PROOF OF PROPOSITION 2.2 

We first show that R m ^oo^ R$(XV>) £ 00 V j. Observe that H* m (t) := 
m- l Y?=2Mx 1 <t+ q /m} > H m (t) ■=m- 1 Y?=2Mx 1 <t} for all t, and that, by 
definition of ^(jW) and (1.2), we have 

> x > = < max > 1 



7T7 — 1 J [t=qr /rn,...,q(m— l)/m t 

for x £ ((? — l)/(m — 1), 7-/(777, — 1)]. Since 

H m (t) — t 



max 



t=qr /m,...,q(m—l) /m,q t 

H m (t)-t H m {q)-q 



= max< max 

{t=qr /m,...,q(m— l)/m t 

f H* m (t)-t 
< max< max , 

yt=qr /m,...,q(m— l)/m t 

H m (q(m — l)/m) — q(m — l)/m (m — 1) 1 
q{m — 1) /777 777 m 



we have 



1 H m {t)-t 
1 < max 

q t=qr /m,...,q(m—l)/m,q t 

1 1 / H^(t) - 1 
=> 1 < max , 

q t=qr/m,...,q(m—l)/m t 

and because sup t \H m (t) — H m (t)\ — > with probability one (and q^ > 
qx/2), it follows that P(R m /m > x{m-l)/m)+e < P(R$(X&)/(m - 1) > 
x) for sufficiently large m and arbitrary e > 0. This proves that R m — > 

^(XW) 4 00; similar reasoning shows that ^(1^) 
J Rii +1) (X^' +1 )) £ 00. Thus, # m 4 00 implies ^(I^) 4 00 for each j, 
and by the bounded convergence theorem E[(j + (X^)y~ k ] -► when- 
ever 1 < j < /c, so (2.3) follows from (2.1). In the standard uniform case 

equality holds in (2.3) with "lim" in place of "limsup," whence H\ y m — ► qj. 
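To prove the converse, we show that $\Pi_{1,m} \xrightarrow{P} q\gamma \implies R_m^{(1)}(X^{(1)}) \xrightarrow{P} \infty$ and
then that $R_m^{(1)}(X^{(1)}) \xrightarrow{P} \infty \implies R_m \xrightarrow{P} \infty$. Suppose $\Pi_{1,m} \xrightarrow{P} q\gamma$, and assume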
lim sup,^^^ < oo in probability. Then (2.1) with k = 2 and



in the standard uniform case implies 

lim inf E [U? m 1 = 07 lim inf E 1 

m^oo L ±,rTlJ m— >oo 



> 



1 + C 



1 + B^(XW] 

+ {qif > {qif, 



which contradicts III m -> 97; thus, i^i (I' 1 ') -> 00. When k = l, (2.2) in 
the standard uniform case reads 

E(S m ) 



(A.l) 



qj. 



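If $R_m \overset{P}{\nrightarrow} \infty$ then $S_m \overset{P}{\nrightarrow} \infty$, but then $R_m^{(1)}(X^{(1)}) \xrightarrow{P} \infty$ contradicts (A.1) when
we let $m \to \infty$; thus we must have $R_m \xrightarrow{P} \infty$ if $R_m^{(1)}(X^{(1)}) \xrightarrow{P} \infty$.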